Overview

Dataset statistics

Number of variables: 20
Number of observations: 16610
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.5 MiB
Average record size in memory: 160.0 B
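
The memory figures above are mutually consistent if every column holds 8-byte values. That storage width is an assumption, not something the report states; the sketch below only checks the arithmetic against the reported numbers.

```python
# Figures copied from the report.
n_rows = 16_610
n_columns = 20
bytes_per_value = 8  # assumption: each column is stored as 64-bit values

column_bytes = n_rows * bytes_per_value      # one column
record_bytes = n_columns * bytes_per_value   # one row across all columns
total_bytes = n_rows * record_bytes          # whole table

column_kib = column_bytes / 1024
total_mib = total_bytes / 1024**2

print(f"per column: {column_kib:.1f} KiB")
print(f"per record: {record_bytes} B")
print(f"total:      {total_mib:.1f} MiB")
```

Under that assumption the per-column size matches the "Memory size" reported for each variable, including the Boolean one.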

Variable types

NUM: 19
BOOL: 1

Reproduction

Analysis started: 2020-06-23 21:30:19.721728
Analysis finished: 2020-06-23 21:32:06.868181
Duration: 1 minute and 47.15 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml
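
As a quick sanity check, the reported duration can be recomputed from the two timestamps. This is a minimal sketch using only the standard library; the format string is an assumption about how the timestamps are laid out.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the timestamps in the report

started = datetime.strptime("2020-06-23 21:30:19.721728", FMT)
finished = datetime.strptime("2020-06-23 21:32:06.868181", FMT)

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```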

Warnings

df_index has unique values (Unique)
view has 14914 (89.8%) zeros (Zeros)
sqft_basement has 9959 (60.0%) zeros (Zeros)
yr_renovated has 15863 (95.5%) zeros (Zeros)
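
The percentages in the zero warnings are each variable's zero count divided by the number of observations. A small sketch that reproduces them from the counts in the report:

```python
n_rows = 16_610  # number of observations, from the overview

# Zero counts copied from the warnings.
zero_counts = {
    "view": 14_914,
    "sqft_basement": 9_959,
    "yr_renovated": 15_863,
}

# Share of zeros per variable, in percent, rounded as in the report.
zero_share = {
    name: round(100 * count / n_rows, 1)
    for name, count in zero_counts.items()
}

for name, share in zero_share.items():
    print(f"{name}: {share}% zeros")
```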

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct count: 16610
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8306.098254063818
Minimum: 0
Maximum: 16612
Zeros: 1
Zeros (%): < 0.1%
Memory size: 129.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 830.45
Q1: 4153.25
median: 8306.5
Q3: 12458.75
95-th percentile: 15781.55
Maximum: 16612
Range: 16612
Interquartile range (IQR): 8305.5

Descriptive statistics

Standard deviation: 4795.840442
Coefficient of variation (CV): 0.5773878776
Kurtosis: -1.199873842
Mean: 8306.098254
Median Absolute Deviation (MAD): 4153
Skewness: -0.0001451354083
Sum: 137964292
Variance: 23000085.55

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
2047 | 1 | < 0.1%
8817 | 1 | < 0.1%
4775 | 1 | < 0.1%
6822 | 1 | < 0.1%
677 | 1 | < 0.1%
2724 | 1 | < 0.1%
12963 | 1 | < 0.1%
15010 | 1 | < 0.1%
8865 | 1 | < 0.1%
10912 | 1 | < 0.1%
Other values (16600) | 16600 | 99.9%

Minimum 5 values

Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
16612 | 1 | < 0.1%
16611 | 1 | < 0.1%
16610 | 1 | < 0.1%
16609 | 1 | < 0.1%
16608 | 1 | < 0.1%
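
The df_index values run from 0 to 16612, yet only 16610 rows are present, so a few index labels must be absent. Assuming the labels are consecutive integers with some removed (an assumption; the report only gives summary figures), the sketch below derives how many are missing and what they add up to.

```python
# Figures copied from the df_index statistics.
n_rows = 16_610
index_min, index_max = 0, 16_612
reported_sum = 137_964_292

# Assumption: labels were originally the consecutive integers index_min..index_max.
full_span = index_max - index_min + 1
missing_count = full_span - n_rows

full_sum = sum(range(index_min, index_max + 1))
missing_sum = full_sum - reported_sum

print(f"missing labels: {missing_count}")
print(f"sum of the missing labels: {missing_sum}")
```

The result says nothing about which rows were dropped, only that the summary figures are consistent with a small number of removed labels.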
 

price
Real number (ℝ≥0)

Distinct count: 3298
Unique (%): 19.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 533528.465141481
Minimum: 75000.0
Maximum: 7700000.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.8 KiB

Quantile statistics

Minimum: 75000
5-th percentile: 209680
Q1: 315000
median: 447000
Q3: 638925
95-th percentile: 1145000
Maximum: 7700000
Range: 7625000
Interquartile range (IQR): 323925

Descriptive statistics

Standard deviation: 366543.1737
Coefficient of variation (CV): 0.6870170903
Kurtosis: 40.8661537
Mean: 533528.4651
Median Absolute Deviation (MAD): 152000
Skewness: 4.319542239
Sum: 8861907806
Variance: 1.343538982e+11

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
350000 | 147 | 0.9%
450000 | 133 | 0.8%
550000 | 118 | 0.7%
500000 | 115 | 0.7%
325000 | 114 | 0.7%
400000 | 110 | 0.7%
375000 | 108 | 0.7%
250000 | 108 | 0.7%
525000 | 106 | 0.6%
425000 | 106 | 0.6%
Other values (3288) | 15445 | 93.0%

Minimum 5 values

Value | Count | Frequency (%)
75000 | 1 | < 0.1%
78000 | 1 | < 0.1%
80000 | 1 | < 0.1%
81000 | 1 | < 0.1%
82000 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
7700000 | 1 | < 0.1%
7062500 | 1 | < 0.1%
6885000 | 1 | < 0.1%
5570000 | 1 | < 0.1%
5350000 | 1 | < 0.1%
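
Several of the price statistics are derived from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A short sketch that recomputes them from the reported inputs:

```python
# Inputs copied from the price statistics.
minimum, maximum = 75_000, 7_700_000
q1, q3 = 315_000, 638_925
mean = 533_528.4651
std = 366_543.1737

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance

print(f"range={value_range}, IQR={iqr}, CV={cv:.6f}, variance={variance:.6e}")
```

The same relationships apply to every numeric variable in the report.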
 

bedrooms
Real number (ℝ≥0)

Distinct count: 13
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.3659843467790487
Minimum: 0
Maximum: 33
Zeros: 11
Zeros (%): 0.1%
Memory size: 129.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 3
median: 3
Q3: 4
95-th percentile: 5
Maximum: 33
Range: 33
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.937192558
Coefficient of variation (CV): 0.2784304564
Kurtosis: 61.5292295
Mean: 3.365984347
Median Absolute Deviation (MAD): 1
Skewness: 2.385437079
Sum: 55909
Variance: 0.8783298908

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
3 | 7607 | 45.8%
4 | 5256 | 31.6%
2 | 2123 | 12.8%
5 | 1184 | 7.1%
6 | 223 | 1.3%
1 | 156 | 0.9%
7 | 32 | 0.2%
0 | 11 | 0.1%
8 | 10 | 0.1%
9 | 4 | < 0.1%
Other values (3) | 4 | < 0.1%

Minimum 5 values

Value | Count | Frequency (%)
0 | 11 | 0.1%
1 | 156 | 0.9%
2 | 2123 | 12.8%
3 | 7607 | 45.8%
4 | 5256 | 31.6%

Maximum 5 values

Value | Count | Frequency (%)
33 | 1 | < 0.1%
11 | 1 | < 0.1%
10 | 2 | < 0.1%
9 | 4 | < 0.1%
8 | 10 | 0.1%
 

bathrooms
Real number (ℝ≥0)

Distinct count: 29
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.062086092715232
Minimum: 0.0
Maximum: 8.0
Zeros: 9
Zeros (%): 0.1%
Memory size: 129.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1.5
median: 2
Q3: 2.5
95-th percentile: 3.3875
Maximum: 8
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7588425967
Coefficient of variation (CV): 0.3679975339
Kurtosis: 1.465347347
Mean: 2.062086093
Median Absolute Deviation (MAD): 0.5
Skewness: 0.5597552469
Sum: 34251.25
Variance: 0.5758420866

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
2.5 | 3870 | 23.3%
1 | 3216 | 19.4%
1.75 | 2533 | 15.2%
2.25 | 1573 | 9.5%
2 | 1561 | 9.4%
1.5 | 1149 | 6.9%
2.75 | 875 | 5.3%
3 | 549 | 3.3%
3.5 | 449 | 2.7%
3.25 | 381 | 2.3%
Other values (19) | 454 | 2.7%

Minimum 5 values

Value | Count | Frequency (%)
0 | 9 | 0.1%
0.5 | 4 | < 0.1%
0.75 | 55 | 0.3%
1 | 3216 | 19.4%
1.25 | 4 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
8 | 2 | < 0.1%
7.75 | 1 | < 0.1%
7.5 | 1 | < 0.1%
6.75 | 2 | < 0.1%
6.25 | 1 | < 0.1%
 

sqft_living
Real number (ℝ≥0)

Distinct count: 787
Unique (%): 4.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2053.884105960265
Minimum: 370
Maximum: 13540
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.8 KiB

Quantile statistics

Minimum: 370
5-th percentile: 930
Q1: 1415.5
median: 1899.5
Q3: 2500
95-th percentile: 3720
Maximum: 13540
Range: 13170
Interquartile range (IQR): 1084.5

Descriptive statistics

Standard deviation: 904.4817744
Coefficient of variation (CV): 0.440376247
Kurtosis: 6.377088606
Mean: 2053.884106
Median Absolute Deviation (MAD): 529.5
Skewness: 1.574624776
Sum: 34115015
Variance: 818087.2803

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
1400 | 110 | 0.7%
1300 | 109 | 0.7%
1560 | 107 | 0.6%
1010 | 105 | 0.6%
1720 | 104 | 0.6%
1440 | 102 | 0.6%
1660 | 100 | 0.6%
1250 | 100 | 0.6%
1820 | 99 | 0.6%
1480 | 99 | 0.6%
Other values (777) | 15575 | 93.8%

Minimum 5 values

Value | Count | Frequency (%)
370 | 1 | < 0.1%
380 | 1 | < 0.1%
390 | 1 | < 0.1%
410 | 1 | < 0.1%
420 | 2 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
13540 | 1 | < 0.1%
12050 | 1 | < 0.1%
10040 | 1 | < 0.1%
9890 | 1 | < 0.1%
9640 | 1 | < 0.1%
 

sqft_lot
Real number (ℝ≥0)

Distinct count: 8064
Unique (%): 48.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15739.596929560506
Minimum: 520
Maximum: 1651359
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.8 KiB

Quantile statistics

Minimum: 520
5-th percentile: 2800
Q1: 5455
median: 7902.5
Q3: 11070.75
95-th percentile: 44379.5
Maximum: 1651359
Range: 1650839
Interquartile range (IQR): 5615.75

Descriptive statistics

Standard deviation: 41957.99892
Coefficient of variation (CV): 2.6657607
Kurtosis: 295.395126
Mean: 15739.59693
Median Absolute Deviation (MAD): 2702.5
Skewness: 13.1911913
Sum: 261434705
Variance: 1760473673

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
5000 | 285 | 1.7%
6000 | 225 | 1.4%
4000 | 183 | 1.1%
7200 | 177 | 1.1%
7500 | 105 | 0.6%
9600 | 93 | 0.6%
8400 | 90 | 0.5%
4500 | 89 | 0.5%
4800 | 89 | 0.5%
9000 | 76 | 0.5%
Other values (8054) | 15198 | 91.5%

Minimum 5 values

Value | Count | Frequency (%)
520 | 1 | < 0.1%
572 | 1 | < 0.1%
600 | 1 | < 0.1%
609 | 1 | < 0.1%
649 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
1651359 | 1 | < 0.1%
1074218 | 1 | < 0.1%
1024068 | 1 | < 0.1%
982998 | 1 | < 0.1%
982278 | 1 | < 0.1%
 

floors
Real number (ℝ≥0)

Distinct count: 6
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.4324503311258279
Minimum: 1.0
Maximum: 3.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 2
95-th percentile: 2
Maximum: 3.5
Range: 2.5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.5096580185
Coefficient of variation (CV): 0.3557945483
Kurtosis: -0.5415793027
Mean: 1.432450331
Median Absolute Deviation (MAD): 0
Skewness: 0.6923517377
Sum: 23793
Variance: 0.2597512958

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
1 | 8960 | 53.9%
2 | 5679 | 34.2%
1.5 | 1592 | 9.6%
3 | 269 | 1.6%
2.5 | 105 | 0.6%
3.5 | 5 | < 0.1%

Minimum 5 values

Value | Count | Frequency (%)
1 | 8960 | 53.9%
1.5 | 1592 | 9.6%
2 | 5679 | 34.2%
2.5 | 105 | 0.6%
3 | 269 | 1.6%

Maximum 5 values

Value | Count | Frequency (%)
3.5 | 5 | < 0.1%
3 | 269 | 1.6%
2.5 | 105 | 0.6%
2 | 5679 | 34.2%
1.5 | 1592 | 9.6%
 

waterfront
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 129.8 KiB

Value | Count | Frequency (%)
0 | 16478 | 99.2%
1 | 132 | 0.8%
 

view
Real number (ℝ≥0)

ZEROS

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.24388922335942204
Minimum: 0
Maximum: 4
Zeros: 14914
Zeros (%): 89.8%
Memory size: 129.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 4
Range: 4
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.7814686712
Coefficient of variation (CV): 3.204195169
Kurtosis: 10.32891579
Mean: 0.2438892234
Median Absolute Deviation (MAD): 0
Skewness: 3.318248224
Sum: 4051
Variance: 0.6106932841

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
0 | 14914 | 89.8%
2 | 765 | 4.6%
3 | 405 | 2.4%
1 | 266 | 1.6%
4 | 260 | 1.6%

Minimum 5 values

Value | Count | Frequency (%)
0 | 14914 | 89.8%
1 | 266 | 1.6%
2 | 765 | 4.6%
3 | 405 | 2.4%
4 | 260 | 1.6%

Maximum 5 values

Value | Count | Frequency (%)
4 | 260 | 1.6%
3 | 405 | 2.4%
2 | 765 | 4.6%
1 | 266 | 1.6%
0 | 14914 | 89.8%
 

condition
Real number (ℝ≥0)

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.4497290788681516
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 3
median: 3
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 4
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.6666100385
Coefficient of variation (CV): 0.1932354754
Kurtosis: 0.2145281474
Mean: 3.449729079
Median Absolute Deviation (MAD): 0
Skewness: 0.897160679
Sum: 57300
Variance: 0.4443689434

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
3 | 10214 | 61.5%
4 | 4807 | 28.9%
5 | 1425 | 8.6%
2 | 141 | 0.8%
1 | 23 | 0.1%

Minimum 5 values

Value | Count | Frequency (%)
1 | 23 | 0.1%
2 | 141 | 0.8%
3 | 10214 | 61.5%
4 | 4807 | 28.9%
5 | 1425 | 8.6%

Maximum 5 values

Value | Count | Frequency (%)
5 | 1425 | 8.6%
4 | 4807 | 28.9%
3 | 10214 | 61.5%
2 | 141 | 0.8%
1 | 23 | 0.1%
 

grade
Real number (ℝ≥0)

Distinct count: 11
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.592414208308248
Minimum: 3
Maximum: 13
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.8 KiB

Quantile statistics

Minimum: 3
5-th percentile: 6
Q1: 7
median: 7
Q3: 8
95-th percentile: 10
Maximum: 13
Range: 10
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.168418697
Coefficient of variation (CV): 0.1538929074
Kurtosis: 1.424443286
Mean: 7.592414208
Median Absolute Deviation (MAD): 1
Skewness: 0.859343715
Sum: 126110
Variance: 1.365202251

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
7 | 7278 | 43.8%
8 | 4411 | 26.6%
9 | 1814 | 10.9%
6 | 1706 | 10.3%
10 | 802 | 4.8%
11 | 287 | 1.7%
5 | 200 | 1.2%
12 | 73 | 0.4%
4 | 24 | 0.1%
13 | 12 | 0.1%

Minimum 5 values

Value | Count | Frequency (%)
3 | 3 | < 0.1%
4 | 24 | 0.1%
5 | 200 | 1.2%
6 | 1706 | 10.3%
7 | 7278 | 43.8%

Maximum 5 values

Value | Count | Frequency (%)
13 | 12 | 0.1%
12 | 73 | 0.4%
11 | 287 | 1.7%
10 | 802 | 4.8%
9 | 1814 | 10.9%
 

sqft_above
Real number (ℝ≥0)

Distinct count: 710
Unique (%): 4.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1750.221733895244
Minimum: 370
Maximum: 9410
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.8 KiB

Quantile statistics

Minimum: 370
5-th percentile: 840
Q1: 1180
median: 1530
Q3: 2140
95-th percentile: 3310
Maximum: 9410
Range: 9040
Interquartile range (IQR): 960

Descriptive statistics

Standard deviation: 804.7895581
Coefficient of variation (CV): 0.4598214858
Kurtosis: 4.075936661
Mean: 1750.221734
Median Absolute Deviation (MAD): 430
Skewness: 1.5530449
Sum: 29071183
Variance: 647686.2329

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
1010 | 176 | 1.1%
1300 | 174 | 1.0%
1200 | 165 | 1.0%
1220 | 151 | 0.9%
1400 | 147 | 0.9%
1340 | 145 | 0.9%
1250 | 143 | 0.9%
1140 | 141 | 0.8%
1060 | 138 | 0.8%
1180 | 138 | 0.8%
Other values (700) | 15092 | 90.9%

Minimum 5 values

Value | Count | Frequency (%)
370 | 1 | < 0.1%
380 | 1 | < 0.1%
390 | 1 | < 0.1%
410 | 1 | < 0.1%
420 | 2 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
9410 | 1 | < 0.1%
8860 | 1 | < 0.1%
8570 | 1 | < 0.1%
7880 | 1 | < 0.1%
7680 | 1 | < 0.1%
 

sqft_basement
Real number (ℝ≥0)

ZEROS

Distinct count: 271
Unique (%): 1.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 303.6623720650211
Minimum: 0
Maximum: 4820
Zeros: 9959
Zeros (%): 60.0%
Memory size: 129.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 600
95-th percentile: 1200
Maximum: 4820
Range: 4820
Interquartile range (IQR): 600

Descriptive statistics

Standard deviation: 450.7357924
Coefficient of variation (CV): 1.484332054
Kurtosis: 2.712912315
Mean: 303.6623721
Median Absolute Deviation (MAD): 0
Skewness: 1.537486622
Sum: 5043832
Variance: 203162.7546

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
0 | 9959 | 60.0%
500 | 189 | 1.1%
700 | 185 | 1.1%
800 | 177 | 1.1%
600 | 171 | 1.0%
400 | 152 | 0.9%
1000 | 120 | 0.7%
900 | 119 | 0.7%
300 | 107 | 0.6%
530 | 86 | 0.5%
Other values (261) | 5345 | 32.2%

Minimum 5 values

Value | Count | Frequency (%)
0 | 9959 | 60.0%
40 | 2 | < 0.1%
50 | 6 | < 0.1%
60 | 8 | < 0.1%
70 | 2 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
4820 | 1 | < 0.1%
4130 | 1 | < 0.1%
3500 | 1 | < 0.1%
3480 | 1 | < 0.1%
3260 | 1 | < 0.1%
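
The column sums reported for sqft_living, sqft_above and sqft_basement are consistent with living area being the above-ground area plus the basement area. That interpretation of the columns is an assumption, and matching totals do not prove the identity holds row by row; the sketch below only checks the totals.

```python
# Sums copied from the descriptive statistics of the three variables.
sum_sqft_living = 34_115_015
sum_sqft_above = 29_071_183
sum_sqft_basement = 5_043_832

# Zero if the totals agree with sqft_living = sqft_above + sqft_basement.
difference = sum_sqft_living - (sum_sqft_above + sum_sqft_basement)

# Share of the total living area that is basement.
basement_share = sum_sqft_basement / sum_sqft_living

print(f"difference: {difference}")
print(f"basement share of living area: {basement_share:.1%}")
```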
 

yr_built
Real number (ℝ≥0)

Distinct count: 116
Unique (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1967.2948223961469
Minimum: 1900
Maximum: 2015
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.8 KiB

Quantile statistics

Minimum: 1900
5-th percentile: 1914
Q1: 1950
median: 1969
Q3: 1990
95-th percentile: 2005
Maximum: 2015
Range: 115
Interquartile range (IQR): 40

Descriptive statistics

Standard deviation: 27.93517441
Coefficient of variation (CV): 0.01419979054
Kurtosis: -0.6049668121
Mean: 1967.294822
Median Absolute Deviation (MAD): 20
Skewness: -0.4591374584
Sum: 32676767
Variance: 780.3739692

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
1977 | 344 | 2.1%
2004 | 344 | 2.1%
2003 | 339 | 2.0%
1978 | 329 | 2.0%
1968 | 323 | 1.9%
1967 | 295 | 1.8%
2005 | 290 | 1.7%
1959 | 288 | 1.7%
1979 | 288 | 1.7%
1990 | 274 | 1.6%
Other values (106) | 13496 | 81.3%

Minimum 5 values

Value | Count | Frequency (%)
1900 | 74 | 0.4%
1901 | 25 | 0.2%
1902 | 23 | 0.1%
1903 | 33 | 0.2%
1904 | 40 | 0.2%

Maximum 5 values

Value | Count | Frequency (%)
2015 | 10 | 0.1%
2014 | 104 | 0.6%
2013 | 32 | 0.2%
2012 | 32 | 0.2%
2011 | 27 | 0.2%
 

yr_renovated
Real number (ℝ≥0)

ZEROS

Distinct count: 70
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 89.75225767609874
Minimum: 0
Maximum: 2015
Zeros: 15863
Zeros (%): 95.5%
Memory size: 129.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 2015
Range: 2015
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 413.6230914
Coefficient of variation (CV): 4.608497904
Kurtosis: 17.29380256
Mean: 89.75225768
Median Absolute Deviation (MAD): 0
Skewness: 4.392066465
Sum: 1490785
Variance: 171084.0617

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
0 | 15863 | 95.5%
2014 | 78 | 0.5%
2003 | 32 | 0.2%
2005 | 30 | 0.2%
2013 | 28 | 0.2%
2000 | 28 | 0.2%
1990 | 24 | 0.1%
2007 | 24 | 0.1%
2004 | 23 | 0.1%
2002 | 20 | 0.1%
Other values (60) | 460 | 2.8%

Minimum 5 values

Value | Count | Frequency (%)
0 | 15863 | 95.5%
1934 | 1 | < 0.1%
1940 | 2 | < 0.1%
1944 | 1 | < 0.1%
1945 | 3 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
2015 | 13 | 0.1%
2014 | 78 | 0.5%
2013 | 28 | 0.2%
2012 | 9 | 0.1%
2011 | 11 | 0.1%
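
With 95.5% zeros, the reported mean of about 89.75 is not a meaningful year. If a zero stands for "never renovated" (an assumption about the encoding, not stated in the report), the average renovation year among renovated homes can be recovered from the reported sum and zero count, since zeros contribute nothing to the sum.

```python
# Figures copied from the yr_renovated statistics.
n_rows = 16_610
zeros = 15_863
reported_sum = 1_490_785

overall_mean = reported_sum / n_rows      # the mean shown in the report
nonzero_count = n_rows - zeros            # rows with a renovation year
nonzero_mean = reported_sum / nonzero_count

print(f"overall mean:  {overall_mean:.2f}")
print(f"nonzero rows:  {nonzero_count}")
print(f"mean of nonzero values: {nonzero_mean:.1f}")
```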
 

zipcode
Real number (ℝ≥0)

Distinct count: 70
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 98078.16520168573
Minimum: 98001
Maximum: 98199
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.8 KiB

Quantile statistics

Minimum: 98001
5-th percentile: 98004
Q1: 98033
median: 98065
Q3: 98118
95-th percentile: 98177
Maximum: 98199
Range: 198
Interquartile range (IQR): 85

Descriptive statistics

Standard deviation: 54.15373311
Coefficient of variation (CV): 0.0005521487173
Kurtosis: -0.8790295035
Mean: 98078.1652
Median Absolute Deviation (MAD): 42
Skewness: 0.4089065237
Sum: 1629078324
Variance: 2932.62681

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
98052 | 457 | 2.8%
98038 | 456 | 2.7%
98115 | 455 | 2.7%
98034 | 438 | 2.6%
98117 | 422 | 2.5%
98042 | 418 | 2.5%
98103 | 412 | 2.5%
98023 | 408 | 2.5%
98006 | 400 | 2.4%
98133 | 399 | 2.4%
Other values (60) | 12345 | 74.3%

Minimum 5 values

Value | Count | Frequency (%)
98001 | 266 | 1.6%
98002 | 145 | 0.9%
98003 | 230 | 1.4%
98004 | 245 | 1.5%
98005 | 151 | 0.9%

Maximum 5 values

Value | Count | Frequency (%)
98199 | 243 | 1.5%
98198 | 230 | 1.4%
98188 | 117 | 0.7%
98178 | 221 | 1.3%
98177 | 205 | 1.2%
 

lat
Real number (ℝ≥0)

Distinct count: 4833
Unique (%): 29.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 47.56002361228176
Minimum: 47.1559
Maximum: 47.7776
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.8 KiB

Quantile statistics

Minimum: 47.1559
5-th percentile: 47.3102
Q1: 47.4646
median: 47.573
Q3: 47.679075
95-th percentile: 47.7508
Maximum: 47.7776
Range: 0.6217
Interquartile range (IQR): 0.214475

Descriptive statistics

Standard deviation: 0.1397186585
Coefficient of variation (CV): 0.002937733162
Kurtosis: -0.7208026492
Mean: 47.56002361
Median Absolute Deviation (MAD): 0.1067
Skewness: -0.4691442717
Sum: 789971.9922
Variance: 0.01952130354

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
47.6711 | 15 | 0.1%
47.6624 | 14 | 0.1%
47.6955 | 13 | 0.1%
47.5427 | 13 | 0.1%
47.5305 | 12 | 0.1%
47.6388 | 12 | 0.1%
47.697 | 12 | 0.1%
47.5518 | 12 | 0.1%
47.686 | 12 | 0.1%
47.5521 | 11 | 0.1%
Other values (4823) | 16484 | 99.2%

Minimum 5 values

Value | Count | Frequency (%)
47.1559 | 1 | < 0.1%
47.1593 | 1 | < 0.1%
47.1622 | 1 | < 0.1%
47.1647 | 1 | < 0.1%
47.1764 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
47.7776 | 2 | < 0.1%
47.7775 | 3 | < 0.1%
47.7774 | 1 | < 0.1%
47.7772 | 2 | < 0.1%
47.7771 | 1 | < 0.1%
 

long
Real number (ℝ)

Distinct count: 729
Unique (%): 4.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -122.2153820590006
Minimum: -122.519
Maximum: -121.315
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.8 KiB

Quantile statistics

Minimum: -122.519
5-th percentile: -122.387
Q1: -122.327
median: -122.231
Q3: -122.12725
95-th percentile: -121.98545
Maximum: -121.315
Range: 1.204
Interquartile range (IQR): 0.19975

Descriptive statistics

Standard deviation: 0.1386610324
Coefficient of variation (CV): -0.001134562852
Kurtosis: 1.179309129
Mean: -122.2153821
Median Absolute Deviation (MAD): 0.099
Skewness: 0.8915388217
Sum: -2029997.496
Variance: 0.0192268819

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
-122.29 | 95 | 0.6%
-122.3 | 89 | 0.5%
-122.306 | 78 | 0.5%
-122.288 | 78 | 0.5%
-122.372 | 77 | 0.5%
-122.316 | 76 | 0.5%
-122.351 | 76 | 0.5%
-122.291 | 76 | 0.5%
-122.172 | 75 | 0.5%
-122.299 | 73 | 0.4%
Other values (719) | 15817 | 95.2%

Minimum 5 values

Value | Count | Frequency (%)
-122.519 | 1 | < 0.1%
-122.515 | 1 | < 0.1%
-122.514 | 1 | < 0.1%
-122.512 | 1 | < 0.1%
-122.511 | 2 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
-121.315 | 2 | < 0.1%
-121.316 | 1 | < 0.1%
-121.319 | 1 | < 0.1%
-121.321 | 1 | < 0.1%
-121.325 | 1 | < 0.1%
 

sqft_living15
Real number (ℝ≥0)

Distinct count: 621
Unique (%): 3.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1974.796447922938
Minimum: 399
Maximum: 6110
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.8 KiB

Quantile statistics

Minimum: 399
5-th percentile: 1140
Q1: 1490
median: 1830
Q3: 2330
95-th percentile: 3260
Maximum: 6110
Range: 5711
Interquartile range (IQR): 840

Descriptive statistics

Standard deviation: 673.0288617
Coefficient of variation (CV): 0.3408092325
Kurtosis: 1.655339029
Mean: 1974.796448
Median Absolute Deviation (MAD): 400
Skewness: 1.114691539
Sum: 32801369
Variance: 452967.8487

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
1540 | 161 | 1.0%
1560 | 159 | 1.0%
1440 | 148 | 0.9%
1500 | 140 | 0.8%
1610 | 134 | 0.8%
1460 | 133 | 0.8%
1660 | 132 | 0.8%
1510 | 129 | 0.8%
1620 | 129 | 0.8%
1480 | 128 | 0.8%
Other values (611) | 15217 | 91.6%

Minimum 5 values

Value | Count | Frequency (%)
399 | 1 | < 0.1%
460 | 1 | < 0.1%
620 | 1 | < 0.1%
700 | 1 | < 0.1%
710 | 2 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
6110 | 1 | < 0.1%
5790 | 3 | < 0.1%
5610 | 1 | < 0.1%
5600 | 1 | < 0.1%
5500 | 1 | < 0.1%
 

sqft_lot15
Real number (ℝ≥0)

Distinct count: 7282
Unique (%): 43.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13288.307224563516
Minimum: 660
Maximum: 871200
Zeros: 0
Zeros (%): 0.0%
Memory size: 129.8 KiB

Quantile statistics

Minimum: 660
5-th percentile: 3060
Q1: 5421.5
median: 7823
Q3: 10326
95-th percentile: 38092.9
Maximum: 871200
Range: 870540
Interquartile range (IQR): 4904.5

Descriptive statistics

Standard deviation: 27458.70054
Coefficient of variation (CV): 2.066380621
Kurtosis: 122.8017425
Mean: 13288.30722
Median Absolute Deviation (MAD): 2423
Skewness: 8.761807766
Sum: 220718783
Variance: 753980235.3

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
5000 | 337 | 2.0%
4000 | 288 | 1.7%
6000 | 219 | 1.3%
7200 | 167 | 1.0%
7500 | 124 | 0.7%
4800 | 104 | 0.6%
8400 | 91 | 0.5%
8000 | 91 | 0.5%
5100 | 88 | 0.5%
4080 | 84 | 0.5%
Other values (7272) | 15017 | 90.4%

Minimum 5 values

Value | Count | Frequency (%)
660 | 1 | < 0.1%
887 | 1 | < 0.1%
928 | 1 | < 0.1%
955 | 1 | < 0.1%
972 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
871200 | 1 | < 0.1%
560617 | 1 | < 0.1%
438213 | 1 | < 0.1%
434728 | 1 | < 0.1%
425581 | 1 | < 0.1%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
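
The definition above can be sketched in a few lines of plain Python. The function name `pearson_r` and the toy data are illustrative assumptions, not part of the report or of any library.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors in the covariance and the standard deviations cancel,
    # so plain sums of products and squares are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical toy data, not taken from the profiled dataset.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

r_original = pearson_r(xs, ys)
# Shifting and rescaling each variable separately (positive scale) leaves r unchanged.
r_rescaled = pearson_r([10 * x + 3 for x in xs], [0.5 * y - 7 for y in ys])
# Negating one variable flips the sign of r.
r_negated = pearson_r(xs, [-y for y in ys])

print(r_original, r_rescaled, r_negated)
```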

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
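
As a sketch of that recipe: rank both variables, then apply the Pearson formula to the ranks. The helper names and the toy data are illustrative assumptions; ties receive the average of the ranks they span, which is one common convention.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical toy data: a monotonic but nonlinear relationship (y = x**3).
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]

rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
print(rho, r)
```

On this toy data the ranks agree perfectly, so ρ reaches its maximum while r stays below it, which is the contrast the description draws.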

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
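
The pair-counting formula can be sketched directly. This is the plain variant without a tie correction (often called tau-a); libraries commonly apply a tie-adjusted variant instead, so results can differ when the data contain ties. The function name and the toy data are illustrative assumptions.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables order the pair the same way.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # A product of zero is a tie and counts toward neither total.
    return (concordant - discordant) / len(pairs)


# Hypothetical toy data: one swapped pair out of six.
xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]

tau = kendall_tau_a(xs, ys)
print(tau)
```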

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available in the phik package documentation.

Missing values

Sample

First rows

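row | df_index | price | bedrooms | bathrooms | sqft_living | sqft_lot | floors | waterfront | view | condition | grade | sqft_above | sqft_basement | yr_built | yr_renovated | zipcode | lat | long | sqft_living15 | sqft_lot15
0 | 0 | 221900.0 | 3 | 1.00 | 1180 | 5650 | 1.0 | 0 | 0 | 3 | 7 | 1180 | 0 | 1955 | 0 | 98178 | 47.5112 | -122.257 | 1340 | 5650
1 | 1 | 538000.0 | 3 | 2.25 | 2570 | 7242 | 2.0 | 0 | 0 | 3 | 7 | 2170 | 400 | 1951 | 1991 | 98125 | 47.7210 | -122.319 | 1690 | 7639
2 | 2 | 180000.0 | 2 | 1.00 | 770 | 10000 | 1.0 | 0 | 0 | 3 | 6 | 770 | 0 | 1933 | 0 | 98028 | 47.7379 | -122.233 | 2720 | 8062
3 | 3 | 604000.0 | 4 | 3.00 | 1960 | 5000 | 1.0 | 0 | 0 | 5 | 7 | 1050 | 910 | 1965 | 0 | 98136 | 47.5208 | -122.393 | 1360 | 5000
4 | 4 | 510000.0 | 3 | 2.00 | 1680 | 8080 | 1.0 | 0 | 0 | 3 | 8 | 1680 | 0 | 1987 | 0 | 98074 | 47.6168 | -122.045 | 1800 | 7503
5 | 5 | 1225000.0 | 4 | 4.50 | 5420 | 101930 | 1.0 | 0 | 0 | 3 | 11 | 3890 | 1530 | 2001 | 0 | 98053 | 47.6561 | -122.005 | 4760 | 101930
6 | 6 | 257500.0 | 3 | 2.25 | 1715 | 6819 | 2.0 | 0 | 0 | 3 | 7 | 1715 | 0 | 1995 | 0 | 98003 | 47.3097 | -122.327 | 2238 | 6819
7 | 7 | 291850.0 | 3 | 1.50 | 1060 | 9711 | 1.0 | 0 | 0 | 3 | 7 | 1060 | 0 | 1963 | 0 | 98198 | 47.4095 | -122.315 | 1650 | 9711
8 | 8 | 229500.0 | 3 | 1.00 | 1780 | 7470 | 1.0 | 0 | 0 | 3 | 7 | 1050 | 730 | 1960 | 0 | 98146 | 47.5123 | -122.337 | 1780 | 8113
9 | 9 | 323000.0 | 3 | 2.50 | 1890 | 6560 | 2.0 | 0 | 0 | 3 | 7 | 1890 | 0 | 2003 | 0 | 98038 | 47.3684 | -122.031 | 2390 | 7570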

Last rows

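row | df_index | price | bedrooms | bathrooms | sqft_living | sqft_lot | floors | waterfront | view | condition | grade | sqft_above | sqft_basement | yr_built | yr_renovated | zipcode | lat | long | sqft_living15 | sqft_lot15
16600 | 16603 | 225000.0 | 2 | 1.00 | 910 | 9612 | 1.0 | 0 | 0 | 4 | 7 | 910 | 0 | 1981 | 0 | 98058 | 47.4297 | -122.152 | 1410 | 9611
16601 | 16604 | 250000.0 | 3 | 1.00 | 1460 | 10914 | 1.0 | 0 | 0 | 4 | 7 | 1460 | 0 | 1959 | 0 | 98058 | 47.4511 | -122.173 | 1490 | 8314
16602 | 16605 | 585000.0 | 4 | 3.25 | 3410 | 34939 | 2.0 | 0 | 0 | 4 | 9 | 2470 | 940 | 1992 | 0 | 98027 | 47.4590 | -122.003 | 2450 | 39045
16603 | 16606 | 685000.0 | 3 | 1.75 | 2720 | 4720 | 1.5 | 0 | 0 | 4 | 7 | 1580 | 1140 | 1925 | 0 | 98105 | 47.6691 | -122.290 | 1660 | 4640
16604 | 16607 | 420000.0 | 5 | 2.75 | 2280 | 10319 | 1.0 | 0 | 0 | 3 | 8 | 1300 | 980 | 1959 | 0 | 98177 | 47.7566 | -122.363 | 2370 | 8056
16605 | 16608 | 245000.0 | 2 | 1.00 | 670 | 1675 | 1.0 | 0 | 0 | 5 | 6 | 670 | 0 | 1960 | 0 | 98144 | 47.5918 | -122.295 | 1220 | 1740
16606 | 16609 | 275000.0 | 4 | 2.00 | 1480 | 15000 | 1.0 | 0 | 0 | 4 | 7 | 1480 | 0 | 1957 | 0 | 98055 | 47.4312 | -122.196 | 1450 | 8768
16607 | 16610 | 270000.0 | 3 | 2.00 | 2330 | 8000 | 1.0 | 0 | 0 | 3 | 7 | 1390 | 940 | 1986 | 0 | 98023 | 47.2958 | -122.368 | 1570 | 7227
16608 | 16611 | 767250.0 | 4 | 3.00 | 2170 | 2500 | 2.0 | 0 | 0 | 3 | 8 | 1710 | 460 | 1997 | 0 | 98115 | 47.6742 | -122.303 | 2170 | 4080
16609 | 16612 | 229000.0 | 3 | 2.00 | 1760 | 9900 | 1.0 | 0 | 0 | 4 | 7 | 1760 | 0 | 1943 | 0 | 98166 | 47.4783 | -122.338 | 1190 | 9900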